Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 3790 |
| Missing cells | 3956 |
| Missing cells (%) | 7.5% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 414.7 KiB |
| Average record size in memory | 112.0 B |
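The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing percentage is taken over all cells (observations × variables) and that 1 KiB = 1024 bytes; the variable names are illustrative and not part of pandas-profiling:

```python
# Raw counts copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 3790
missing_cells = 3956
total_size_kib = 414.7

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```

Both results round to the values shown in the table (7.5% and 112.0 B).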
Variable types
| NUM | 9 |
|---|---|
| CAT | 4 |
| BOOL | 1 |
Reproduction
| Analysis started | 2020-07-10 12:38:32.755950 |
|---|---|
| Analysis finished | 2020-07-10 12:38:48.603436 |
| Duration | 15.85 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| headquarter has constant value "0" | Constant |
| date_of_establishment has a high cardinality: 843 distinct values | High cardinality |
| location has a high cardinality: 1489 distinct values | High cardinality |
| loc.details has a high cardinality: 299 distinct values | High cardinality |
| location.Code is highly correlated with id | High correlation |
| id is highly correlated with location.Code | High correlation |
| deposit_amount_2012 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| deposit_amount_2011 is highly correlated with deposit_amount_2012 and 5 other fields | High correlation |
| deposit_amount_2013 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| deposit_amount_2014 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| deposit_amount_2015 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| deposit_amount_2016 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| deposit_amount_2017 is highly correlated with deposit_amount_2011 and 5 other fields | High correlation |
| date_of_establishment has 2040 (53.8%) missing values | Missing |
| deposit_amount_2011 has 740 (19.5%) missing values | Missing |
| deposit_amount_2012 has 578 (15.3%) missing values | Missing |
| deposit_amount_2013 has 329 (8.7%) missing values | Missing |
| deposit_amount_2014 has 175 (4.6%) missing values | Missing |
| deposit_amount_2015 has 56 (1.5%) missing values | Missing |
| deposit_amount_2011 is highly skewed (γ1 = 54.23092623) | Skewed |
| deposit_amount_2012 is highly skewed (γ1 = 55.80428776) | Skewed |
| deposit_amount_2013 is highly skewed (γ1 = 57.73524155) | Skewed |
| deposit_amount_2014 is highly skewed (γ1 = 58.94451219) | Skewed |
| deposit_amount_2015 is highly skewed (γ1 = 59.86730287) | Skewed |
| deposit_amount_2016 is highly skewed (γ1 = 60.28538584) | Skewed |
| deposit_amount_2017 is highly skewed (γ1 = 60.28538584) | Skewed |
| id has unique values | Unique |
| location.Code has unique values | Unique |
| deposit_amount_2011 has 47 (1.2%) zeros | Zeros |
| deposit_amount_2012 has 45 (1.2%) zeros | Zeros |
| deposit_amount_2013 has 49 (1.3%) zeros | Zeros |
| deposit_amount_2014 has 50 (1.3%) zeros | Zeros |
| deposit_amount_2015 has 51 (1.3%) zeros | Zeros |
| deposit_amount_2016 has 50 (1.3%) zeros | Zeros |
| deposit_amount_2017 has 50 (1.3%) zeros | Zeros |
id
Numeric
| Distinct count | 3790 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1895.5 |
|---|---|
| Minimum | 1 |
| Maximum | 3790 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 190.45 |
| Q1 | 948.25 |
| median | 1895.5 |
| Q3 | 2842.75 |
| 95-th percentile | 3600.55 |
| Maximum | 3790 |
| Range | 3789 |
| Interquartile range (IQR) | 1894.5 |
Descriptive statistics
| Standard deviation | 1094.223088 |
|---|---|
| Coefficient of variation (CV) | 0.5772741167 |
| Kurtosis | -1.2 |
| Mean | 1895.5 |
| Median Absolute Deviation (MAD) | 947.5 |
| Skewness | 0 |
| Sum | 7183945 |
| Variance | 1197324.167 |
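These figures are what one would expect from a column that holds each integer from 1 to 3790 exactly once, which is consistent with the reported minimum, maximum, and distinct count. A minimal standard-library sketch under that assumption, using linear interpolation for percentiles and the sample (n - 1) variance; the `percentile` helper is illustrative and not how pandas-profiling computes its values internally:

```python
import statistics

# Assumption: the column is exactly the integers 1..3790.
values = list(range(1, 3791))

def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbouring ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, median, q1, q3, q3 - q1, mad, round(variance, 3))
```

Under this assumption the sum, mean, quartiles, IQR, MAD, and variance agree with the tables above.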
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 621 | 1 | < 0.1% | |
| 645 | 1 | < 0.1% | |
| 2692 | 1 | < 0.1% | |
| 641 | 1 | < 0.1% | |
| 2688 | 1 | < 0.1% | |
| 637 | 1 | < 0.1% | |
| 2684 | 1 | < 0.1% | |
| 633 | 1 | < 0.1% | |
| 2680 | 1 | < 0.1% | |
| Other values (3780) | 3780 | 99.7% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 3790 | 1 | < 0.1% | |
| 3789 | 1 | < 0.1% | |
| 3788 | 1 | < 0.1% | |
| 3787 | 1 | < 0.1% | |
| 3786 | 1 | < 0.1% |
headquarter
Boolean
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 29.6 KiB |
| 0 |
|---|
| Value | Count | Frequency (%) | |
| 0 | 3790 | 100.0% |
location.Code
Numeric
| Distinct count | 3790 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5395.806332453826 |
|---|---|
| Minimum | 2871 |
| Maximum | 7994 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 2871 |
|---|---|
| 5-th percentile | 3078.45 |
| Q1 | 4067.25 |
| median | 5261.5 |
| Q3 | 6863.25 |
| 95-th percentile | 7779.55 |
| Maximum | 7994 |
| Range | 5123 |
| Interquartile range (IQR) | 2796 |
Descriptive statistics
| Standard deviation | 1549.105135 |
|---|---|
| Coefficient of variation (CV) | 0.2870942802 |
| Kurtosis | -1.284903617 |
| Mean | 5395.806332 |
| Median Absolute Deviation (MAD) | 1409 |
| Skewness | 0.09235452196 |
| Sum | 20450106 |
| Variance | 2399726.72 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 4094 | 1 | < 0.1% | |
| 2884 | 1 | < 0.1% | |
| 2900 | 1 | < 0.1% | |
| 4947 | 1 | < 0.1% | |
| 6994 | 1 | < 0.1% | |
| 2896 | 1 | < 0.1% | |
| 4943 | 1 | < 0.1% | |
| 6990 | 1 | < 0.1% | |
| 2892 | 1 | < 0.1% | |
| 4939 | 1 | < 0.1% | |
| Other values (3780) | 3780 | 99.7% |
| Value | Count | Frequency (%) | |
| 2871 | 1 | < 0.1% | |
| 2872 | 1 | < 0.1% | |
| 2873 | 1 | < 0.1% | |
| 2874 | 1 | < 0.1% | |
| 2875 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 7994 | 1 | < 0.1% | |
| 7993 | 1 | < 0.1% | |
| 7989 | 1 | < 0.1% | |
| 7987 | 1 | < 0.1% | |
| 7986 | 1 | < 0.1% |
date_of_establishment
Categorical
| Distinct count | 843 |
|---|---|
| Unique (%) | 48.2% |
| Missing | 2040 |
| Missing (%) | 53.8% |
| Memory size | 29.6 KiB |
| 1920-01-01 | 129 |
|---|---|
| 1890-01-01 | 111 |
| 1966-05-05 | 32 |
| 1935-01-11 | 30 |
| 2004-01-07 | 27 |
| Other values (838) |
| Value | Count | Frequency (%) | |
| 1920-01-01 | 129 | 3.4% | |
| 1890-01-01 | 111 | 2.9% | |
| 1966-05-05 | 32 | 0.8% | |
| 1935-01-11 | 30 | 0.8% | |
| 2004-01-07 | 27 | 0.7% | |
| 1924-01-01 | 27 | 0.7% | |
| 1935-01-07 | 26 | 0.7% | |
| 1934-01-12 | 17 | 0.4% | |
| 1908-01-01 | 14 | 0.4% | |
| 1916-12-31 | 14 | 0.4% | |
| Other values (833) | 1323 | 34.9% | |
| (Missing) | 2040 | 53.8% |
Length
| Max length | 10 |
|---|---|
| Median length | 3 |
| Mean length | 6.232189974 |
| Min length | 3 |
location
Categorical
| Distinct count | 1489 |
|---|---|
| Unique (%) | 39.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 29.6 KiB |
| Chicago | 99 |
|---|---|
| New York City | 76 |
| Houston | 72 |
| Los Angeles | 58 |
| Indianapolis | 49 |
| Other values (1484) |
| Value | Count | Frequency (%) | |
| Chicago | 99 | 2.6% | |
| New York City | 76 | 2.0% | |
| Houston | 72 | 1.9% | |
| Los Angeles | 58 | 1.5% | |
| Indianapolis | 49 | 1.3% | |
| Miami | 44 | 1.2% | |
| San Francisco | 42 | 1.1% | |
| Brooklyn | 40 | 1.1% | |
| Seattle | 36 | 0.9% | |
| San Diego | 35 | 0.9% | |
| Other values (1479) | 3239 | 85.5% |
Length
| Max length | 22 |
|---|---|
| Median length | 9 |
| Mean length | 9.158047493 |
| Min length | 4 |
loc.details
Categorical
| Distinct count | 299 |
|---|---|
| Unique (%) | 7.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 29.6 KiB |
| Los Angeles | 298 |
|---|---|
| Cook | 159 |
| Orange | 149 |
| Harris | 102 |
| San Diego | 92 |
| Other values (294) |
| Value | Count | Frequency (%) | |
| Los Angeles | 298 | 7.9% | |
| Cook | 159 | 4.2% | |
| Orange | 149 | 3.9% | |
| Harris | 102 | 2.7% | |
| San Diego | 92 | 2.4% | |
| Maricopa | 90 | 2.4% | |
| King | 86 | 2.3% | |
| Miami-Dade | 82 | 2.2% | |
| New York | 76 | 2.0% | |
| Clark | 71 | 1.9% | |
| Other values (289) | 2585 | 68.2% |
Length
| Max length | 20 |
|---|---|
| Median length | 7 |
| Mean length | 7.517150396 |
| Min length | 3 |
state
Categorical
| Distinct count | 25 |
|---|---|
| Unique (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 29.6 KiB |
| CA | 1003 |
|---|---|
| NY | 425 |
| FL | 391 |
| TX | 378 |
| IL | 252 |
| Other values (20) |
| Value | Count | Frequency (%) | |
| CA | 1003 | 26.5% | |
| NY | 425 | 11.2% | |
| FL | 391 | 10.3% | |
| TX | 378 | 10.0% | |
| IL | 252 | 6.6% | |
| WA | 204 | 5.4% | |
| NJ | 179 | 4.7% | |
| IN | 177 | 4.7% | |
| CO | 114 | 3.0% | |
| OR | 113 | 3.0% | |
| Other values (15) | 554 | 14.6% |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
deposit_amount_2011
Numeric
| Distinct count | 2955 |
|---|---|
| Unique (%) | 96.9% |
| Missing | 740 |
| Missing (%) | 19.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 168320.07688524592 |
|---|---|
| Minimum | 0.0 |
| Maximum | 230365992.0 |
| Zeros | 47 |
| Zeros (%) | 1.2% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 7801.65 |
| Q1 | 28398 |
| median | 53442 |
| Q3 | 99109.125 |
| 95-th percentile | 222504.9 |
| Maximum | 230365992 |
| Range | 230365992 |
| Interquartile range (IQR) | 70711.125 |
Descriptive statistics
| Standard deviation | 4196386.456 |
|---|---|
| Coefficient of variation (CV) | 24.9309918 |
| Kurtosis | 2973.048844 |
| Mean | 168320.0769 |
| Median Absolute Deviation (MAD) | 30817.5 |
| Skewness | 54.23092623 |
| Sum | 513376234.5 |
| Variance | 1.760965929e+13 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 47 | 1.2% | |
| 25234.5 | 3 | 0.1% | |
| 46299 | 2 | 0.1% | |
| 9663 | 2 | 0.1% | |
| 10548 | 2 | 0.1% | |
| 40938 | 2 | 0.1% | |
| 179181 | 2 | 0.1% | |
| 49932 | 2 | 0.1% | |
| 28243.5 | 2 | 0.1% | |
| 55513.5 | 2 | 0.1% | |
| Other values (2945) | 2984 | 78.7% | |
| (Missing) | 740 | 19.5% |
| Value | Count | Frequency (%) | |
| 0 | 47 | 1.2% | |
| 156 | 1 | < 0.1% | |
| 172.5 | 1 | < 0.1% | |
| 274.5 | 1 | < 0.1% | |
| 562.5 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 230365992 | 1 | < 0.1% | |
| 22813977 | 1 | < 0.1% | |
| 7374436.5 | 1 | < 0.1% | |
| 5990880 | 1 | < 0.1% | |
| 5568664.5 | 1 | < 0.1% |
deposit_amount_2012
Numeric
| Distinct count | 3104 |
|---|---|
| Unique (%) | 96.6% |
| Missing | 578 |
| Missing (%) | 15.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 188270.4662204234 |
|---|---|
| Minimum | 0.0 |
| Maximum | 291582000.0 |
| Zeros | 45 |
| Zeros (%) | 1.2% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 7042.275 |
| Q1 | 30199.125 |
| median | 55774.5 |
| Q3 | 100420.5 |
| 95-th percentile | 232479.15 |
| Maximum | 291582000 |
| Range | 291582000 |
| Interquartile range (IQR) | 70221.375 |
Descriptive statistics
| Standard deviation | 5171072.99 |
|---|---|
| Coefficient of variation (CV) | 27.46619315 |
| Kurtosis | 3143.282123 |
| Mean | 188270.4662 |
| Median Absolute Deviation (MAD) | 31239.75 |
| Skewness | 55.80428776 |
| Sum | 604724737.5 |
| Variance | 2.673999587e+13 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 45 | 1.2% | |
| 30319.5 | 3 | 0.1% | |
| 68905.5 | 3 | 0.1% | |
| 23964 | 2 | 0.1% | |
| 12310.5 | 2 | 0.1% | |
| 161722.5 | 2 | 0.1% | |
| 75295.5 | 2 | 0.1% | |
| 22470 | 2 | 0.1% | |
| 22938 | 2 | 0.1% | |
| 53913 | 2 | 0.1% | |
| Other values (3094) | 3147 | 83.0% | |
| (Missing) | 578 | 15.3% |
| Value | Count | Frequency (%) | |
| 0 | 45 | 1.2% | |
| 4.5 | 1 | < 0.1% | |
| 117 | 1 | < 0.1% | |
| 180 | 1 | < 0.1% | |
| 213 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 291582000 | 1 | < 0.1% | |
| 26306682 | 1 | < 0.1% | |
| 8979693 | 1 | < 0.1% | |
| 7189416 | 1 | < 0.1% | |
| 6691308 | 1 | < 0.1% |
deposit_amount_2013
Numeric
| Distinct count | 3356 |
|---|---|
| Unique (%) | 97.0% |
| Missing | 329 |
| Missing (%) | 8.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 193380.30424732735 |
|---|---|
| Minimum | 0.0 |
| Maximum | 311051982.0 |
| Zeros | 49 |
| Zeros (%) | 1.3% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 8251.5 |
| Q1 | 31597.5 |
| median | 59616 |
| Q3 | 107244 |
| 95-th percentile | 251787 |
| Maximum | 311051982 |
| Range | 311051982 |
| Interquartile range (IQR) | 75646.5 |
Descriptive statistics
| Standard deviation | 5320718.404 |
|---|---|
| Coefficient of variation (CV) | 27.51427259 |
| Kurtosis | 3370.609983 |
| Mean | 193380.3042 |
| Median Absolute Deviation (MAD) | 33573 |
| Skewness | 57.73524155 |
| Sum | 669289233 |
| Variance | 2.831004433e+13 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 49 | 1.3% | |
| 50184 | 2 | 0.1% | |
| 62122.5 | 2 | 0.1% | |
| 62671.5 | 2 | 0.1% | |
| 31689 | 2 | 0.1% | |
| 76456.5 | 2 | 0.1% | |
| 17676 | 2 | 0.1% | |
| 80827.5 | 2 | 0.1% | |
| 30180 | 2 | 0.1% | |
| 4477.5 | 2 | 0.1% | |
| Other values (3346) | 3394 | 89.6% | |
| (Missing) | 329 | 8.7% |
| Value | Count | Frequency (%) | |
| 0 | 49 | 1.3% | |
| 82.5 | 1 | < 0.1% | |
| 142.5 | 1 | < 0.1% | |
| 156 | 1 | < 0.1% | |
| 198 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 311051982 | 1 | < 0.1% | |
| 31898808 | 1 | < 0.1% | |
| 9417007.5 | 1 | < 0.1% | |
| 9172369.5 | 1 | < 0.1% | |
| 5943204 | 1 | < 0.1% |
deposit_amount_2014
Numeric
| Distinct count | 3504 |
|---|---|
| Unique (%) | 96.9% |
| Missing | 175 |
| Missing (%) | 4.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 204574.2684647303 |
|---|---|
| Minimum | 0.0 |
| Maximum | 335093029.5 |
| Zeros | 50 |
| Zeros (%) | 1.3% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10120.05 |
| Q1 | 34971.75 |
| median | 63537 |
| Q3 | 114528.75 |
| 95-th percentile | 266363.7 |
| Maximum | 335093029.5 |
| Range | 335093029.5 |
| Interquartile range (IQR) | 79557 |
Descriptive statistics
| Standard deviation | 5610535.906 |
|---|---|
| Coefficient of variation (CV) | 27.42542328 |
| Kurtosis | 3515.557103 |
| Mean | 204574.2685 |
| Median Absolute Deviation (MAD) | 34978.5 |
| Skewness | 58.94451219 |
| Sum | 739535980.5 |
| Variance | 3.147811315e+13 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 50 | 1.3% | |
| 51912 | 3 | 0.1% | |
| 52413 | 3 | 0.1% | |
| 104863.5 | 2 | 0.1% | |
| 31773 | 2 | 0.1% | |
| 20734.5 | 2 | 0.1% | |
| 105906 | 2 | 0.1% | |
| 81633 | 2 | 0.1% | |
| 46066.5 | 2 | 0.1% | |
| 24723 | 2 | 0.1% | |
| Other values (3494) | 3545 | 93.5% | |
| (Missing) | 175 | 4.6% |
| Value | Count | Frequency (%) | |
| 0 | 50 | 1.3% | |
| 108 | 1 | < 0.1% | |
| 229.5 | 1 | < 0.1% | |
| 274.5 | 1 | < 0.1% | |
| 364.5 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 335093029.5 | 1 | < 0.1% | |
| 34449870 | 1 | < 0.1% | |
| 12502327.5 | 1 | < 0.1% | |
| 9670909.5 | 1 | < 0.1% | |
| 7888276.5 | 1 | < 0.1% |
deposit_amount_2015
Numeric
| Distinct count | 3642 |
|---|---|
| Unique (%) | 97.5% |
| Missing | 56 |
| Missing (%) | 1.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 218387.40747188003 |
|---|---|
| Minimum | 0.0 |
| Maximum | 362310873.0 |
| Zeros | 51 |
| Zeros (%) | 1.3% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 12103.425 |
| Q1 | 39358.5 |
| median | 70158 |
| Q3 | 124944.75 |
| 95-th percentile | 284444.025 |
| Maximum | 362310873 |
| Range | 362310873 |
| Interquartile range (IQR) | 85586.25 |
Descriptive statistics
| Standard deviation | 5970415.949 |
|---|---|
| Coefficient of variation (CV) | 27.33864566 |
| Kurtosis | 3627.436235 |
| Mean | 218387.4075 |
| Median Absolute Deviation (MAD) | 37224.75 |
| Skewness | 59.86730287 |
| Sum | 815458579.5 |
| Variance | 3.56458666e+13 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 51 | 1.3% | |
| 57706.5 | 2 | 0.1% | |
| 63385.5 | 2 | 0.1% | |
| 40044 | 2 | 0.1% | |
| 25114.5 | 2 | 0.1% | |
| 49062 | 2 | 0.1% | |
| 34950 | 2 | 0.1% | |
| 11340 | 2 | 0.1% | |
| 70731 | 2 | 0.1% | |
| 15379.5 | 2 | 0.1% | |
| Other values (3632) | 3665 | 96.7% | |
| (Missing) | 56 | 1.5% |
| Value | Count | Frequency (%) | |
| 0 | 51 | 1.3% | |
| 51 | 1 | < 0.1% | |
| 769.5 | 1 | < 0.1% | |
| 928.5 | 1 | < 0.1% | |
| 1026 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 362310873 | 1 | < 0.1% | |
| 39260689.5 | 1 | < 0.1% | |
| 10713843 | 1 | < 0.1% | |
| 10233148.5 | 1 | < 0.1% | |
| 7121397 | 1 | < 0.1% |
deposit_amount_2016
Numeric
| Distinct count | 3672 |
|---|---|
| Unique (%) | 97.4% |
| Missing | 19 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 236442.20365950678 |
|---|---|
| Minimum | 0.0 |
| Maximum | 391939125.0 |
| Zeros | 50 |
| Zeros (%) | 1.3% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 17377.5 |
| Q1 | 46321.5 |
| median | 78774 |
| Q3 | 137349 |
| 95-th percentile | 306828.75 |
| Maximum | 391939125 |
| Range | 391939125 |
| Interquartile range (IQR) | 91027.5 |
Descriptive statistics
| Standard deviation | 6422120.282 |
|---|---|
| Coefficient of variation (CV) | 27.16148041 |
| Kurtosis | 3674.143162 |
| Mean | 236442.2037 |
| Median Absolute Deviation (MAD) | 40014 |
| Skewness | 60.28538584 |
| Sum | 891623550 |
| Variance | 4.124362892e+13 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 50 | 1.3% | |
| 21331.5 | 2 | 0.1% | |
| 109528.5 | 2 | 0.1% | |
| 35719.5 | 2 | 0.1% | |
| 32496 | 2 | 0.1% | |
| 86626.5 | 2 | 0.1% | |
| 34861.5 | 2 | 0.1% | |
| 76015.5 | 2 | 0.1% | |
| 43893 | 2 | 0.1% | |
| 110755.5 | 2 | 0.1% | |
| Other values (3662) | 3703 | 97.7% | |
| (Missing) | 19 | 0.5% |
| Value | Count | Frequency (%) | |
| 0 | 50 | 1.3% | |
| 378 | 1 | < 0.1% | |
| 598.5 | 1 | < 0.1% | |
| 985.5 | 1 | < 0.1% | |
| 1213.5 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 391939125 | 1 | < 0.1% | |
| 40416823.5 | 1 | < 0.1% | |
| 10373826 | 1 | < 0.1% | |
| 10013170.5 | 1 | < 0.1% | |
| 7904286 | 1 | < 0.1% |
deposit_amount_2017
Numeric
| Distinct count | 3672 |
|---|---|
| Unique (%) | 97.4% |
| Missing | 19 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 354663.3054892601 |
|---|---|
| Minimum | 0.0 |
| Maximum | 587908687.5 |
| Zeros | 50 |
| Zeros (%) | 1.3% |
| Memory size | 29.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 26066.25 |
| Q1 | 69482.25 |
| median | 118161 |
| Q3 | 206023.5 |
| 95-th percentile | 460243.125 |
| Maximum | 587908687.5 |
| Range | 587908687.5 |
| Interquartile range (IQR) | 136541.25 |
Descriptive statistics
| Standard deviation | 9633180.424 |
|---|---|
| Coefficient of variation (CV) | 27.16148041 |
| Kurtosis | 3674.143162 |
| Mean | 354663.3055 |
| Median Absolute Deviation (MAD) | 60021 |
| Skewness | 60.28538584 |
| Sum | 1337435325 |
| Variance | 9.279816507e+13 |
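The coefficient of variation, kurtosis, and skewness in this table are identical to those reported for deposit_amount_2016, while the mean is 1.5 times larger. That pattern is what one would see if this column were a constant multiple of the 2016 column, because all three statistics are unchanged by positive rescaling. The report does not state this, so treat it as an inference from the numbers. A small standard-library sketch with made-up values (the population skewness used here differs from a bias-corrected estimate only by a factor that depends on the sample size, so the invariance carries over):

```python
import statistics

def skewness(xs):
    """Population skewness: third central moment over sigma cubed."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def coeff_of_variation(xs):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(xs) / statistics.fmean(xs)

# Made-up, right-skewed sample (not taken from the dataset).
base = [0.0, 10.0, 12.0, 15.0, 20.0, 400.0]
scaled = [1.5 * x for x in base]  # same data, rescaled by a constant

print(skewness(base), skewness(scaled))
print(coeff_of_variation(base), coeff_of_variation(scaled))
```

Each pair of printed values should agree up to floating-point rounding, whereas the mean and standard deviation both grow by the factor 1.5.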
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 50 | 1.3% | |
| 42367.5 | 2 | 0.1% | |
| 209866.5 | 2 | 0.1% | |
| 234477 | 2 | 0.1% | |
| 98813.25 | 2 | 0.1% | |
| 38684.25 | 2 | 0.1% | |
| 99918 | 2 | 0.1% | |
| 76396.5 | 2 | 0.1% | |
| 49988.25 | 2 | 0.1% | |
| 53579.25 | 2 | 0.1% | |
| Other values (3662) | 3703 | 97.7% | |
| (Missing) | 19 | 0.5% |
| Value | Count | Frequency (%) | |
| 0 | 50 | 1.3% | |
| 567 | 1 | < 0.1% | |
| 897.75 | 1 | < 0.1% | |
| 1478.25 | 1 | < 0.1% | |
| 1820.25 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 587908687.5 | 1 | < 0.1% | |
| 60625235.25 | 1 | < 0.1% | |
| 15560739 | 1 | < 0.1% | |
| 15019755.75 | 1 | < 0.1% | |
| 11856429 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
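The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code pandas-profiling runs; the function name `pearson_r` and the sample data are assumptions made for the example:

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n - 1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v + 1.0 for v in x]           # exact increasing linear relation
y_shifted = [10.0 * v - 7.0 for v in y]  # change of location and (positive) scale

print(pearson_r(x, y), pearson_r(x, y_shifted))
print(pearson_r(x, [-v for v in y]))
```

The first two values come out as 1 (up to rounding) regardless of the slope or offset, which illustrates the invariance mentioned above; negating one variable flips the sign to -1.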
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
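As a sketch of that definition, the example below ranks each variable (giving tied values the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. The helper names are illustrative and the data is made up:

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(vx * vy)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but not linear

print(pearson_r(x, y), spearman_rho(x, y))
```

For this cubic relation Pearson's r stays below 1, while ρ reaches 1 because the ordering of the two variables is identical.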
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
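The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated above (often called τ-a), counting ties as neither concordant nor discordant; library implementations commonly use a tie-adjusted variant, so results can differ when the data contains ties. The function name and data are illustrative:

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1    # both variables move in the same direction
        elif direction < 0:
            discordant += 1    # the variables move in opposite directions
        # direction == 0 is a tie and counts toward neither total
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau_a(x, y))
```

Here five of the six pairs are concordant and one is discordant, giving τ = (5 - 1) / 6.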
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.
First rows
| | id | headquarter | location.Code | date_of_establishment | location | loc.details | state | deposit_amount_2011 | deposit_amount_2012 | deposit_amount_2013 | deposit_amount_2014 | deposit_amount_2015 | deposit_amount_2016 | deposit_amount_2017 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 2871 | 1911-06-02 | Wales | Waukesha | WI | 32079.0 | 35971.5 | 37237.5 | 40362.0 | 46021.5 | 46020.0 | 69030.00 |
| 1 | 2 | 0 | 2872 | NaN | Germantown | Washington | WI | 83181.0 | 84846.0 | 97098.0 | 110284.5 | 122035.5 | 133905.0 | 200857.50 |
| 2 | 3 | 0 | 2873 | 1908-06-04 | Brookfield | Waukesha | WI | 136323.0 | 156450.0 | 187557.0 | 188859.0 | 198751.5 | 206044.5 | 309066.75 |
| 3 | 4 | 0 | 2874 | NaN | Pewaukee | Waukesha | WI | 68511.0 | 73932.0 | 79876.5 | 105603.0 | 112113.0 | 110755.5 | 166133.25 |
| 4 | 5 | 0 | 2875 | NaN | Waukesha | Waukesha | WI | 96271.5 | 108325.5 | 104880.0 | 121054.5 | 113956.5 | 109837.5 | 164756.25 |
| 5 | 6 | 0 | 2876 | NaN | Waukesha | Waukesha | WI | 93837.0 | 101592.0 | 118270.5 | 140280.0 | 150987.0 | 168742.5 | 253113.75 |
| 6 | 7 | 0 | 2877 | NaN | Brookfield | Waukesha | WI | 117655.5 | 130725.0 | 153216.0 | 179154.0 | 199660.5 | 214266.0 | 321399.00 |
| 7 | 8 | 0 | 2878 | 1961-04-01 | New Berlin | Waukesha | WI | 126933.0 | 144072.0 | 155919.0 | 164754.0 | 181075.5 | 184749.0 | 277123.50 |
| 8 | 9 | 0 | 2879 | 1933-02-05 | Oconomowoc | Waukesha | WI | 72700.5 | 73044.0 | 82053.0 | 85413.0 | 83767.5 | 87390.0 | 131085.00 |
| 9 | 10 | 0 | 2880 | NaN | Butler | Waukesha | WI | 73921.5 | 73033.5 | 73011.0 | 78331.5 | 80385.0 | 83619.0 | 125428.50 |
Last rows
| | id | headquarter | location.Code | date_of_establishment | location | loc.details | state | deposit_amount_2011 | deposit_amount_2012 | deposit_amount_2013 | deposit_amount_2014 | deposit_amount_2015 | deposit_amount_2016 | deposit_amount_2017 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3780 | 3781 | 0 | 7980 | 2016-03-11 | Laguna Niguel | Orange | CA | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3781 | 3782 | 0 | 7981 | 2016-07-11 | Oklahoma City | Oklahoma | OK | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3782 | 3783 | 0 | 7983 | NaN | Tampa | Hillsborough | FL | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3783 | 3784 | 0 | 7984 | 2017-10-05 | Jacksonville | Duval | FL | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3784 | 3785 | 0 | 7985 | NaN | Woodland Hills | Los Angeles | CA | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3785 | 3786 | 0 | 7986 | 2016-03-10 | Compton | Los Angeles | CA | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3786 | 3787 | 0 | 7987 | 2017-02-01 | Las Vegas | Clark | NV | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3787 | 3788 | 0 | 7989 | NaN | Irvine | Orange | CA | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3788 | 3789 | 0 | 7993 | 2016-12-31 | New Orleans | Orleans | LA | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3789 | 3790 | 0 | 7994 | 2016-12-31 | Buffalo | Erie | NY | NaN | NaN | NaN | NaN | NaN | NaN | NaN |